Variable Selection via Penalized Likelihood

Authors

  • Jianqing Fan
  • Runze Li
Abstract

Variable selection is vital to statistical data analyses. Many of the procedures in use are ad hoc stepwise selection procedures, which are computationally expensive and ignore stochastic errors in the variable selection process of previous steps. An automatic and simultaneous variable selection procedure can be obtained by using a penalized likelihood method. In traditional linear models, the best subset selection and stepwise deletion methods coincide with a penalized least squares method when design matrices are orthonormal. In this paper, we propose a few new approaches to selecting variables for linear models, robust regression models and generalized linear models based on a penalized likelihood approach. A family of thresholding functions is proposed. The LASSO proposed by Tibshirani (1996) is a member of the penalized least squares family, with the L1-penalty. A smoothly clipped absolute deviation (SCAD) penalty function is introduced to ameliorate the properties of the L1-penalty. A unified algorithm is introduced, which is backed up by statistical theory. The new approaches are compared with the ordinary least squares methods, the garrote method by Breiman (1995) and the LASSO method by Tibshirani (1996). Our simulation results show that the newly proposed methods compare favorably with other approaches as an automatic variable selection technique. Because of simultaneous selection of variables and estimation of parameters, we are able to give a simple estimated standard error formula, which is tested to be accurate enough for practical applications. Two real data examples illustrate the versatility and effectiveness of the proposed approaches.
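To make the thresholding functions mentioned in the abstract concrete, the following is a minimal Python sketch of three rules for a single least squares coefficient z under an orthonormal design. The abstract itself gives no formulas, so the expressions are the standard textbook forms of hard thresholding (corresponding to best subset selection), soft thresholding (the L1/LASSO rule), and the SCAD rule. The function names and the default a = 3.7 are assumptions made for illustration, not code from the paper.

```python
import math


def hard_threshold(z, lam):
    """Hard thresholding: keep z unchanged if |z| > lam, otherwise return 0."""
    return z if abs(z) > lam else 0.0


def soft_threshold(z, lam):
    """Soft thresholding (L1 penalty): shrink z toward zero by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule in its standard form (assumed, requires a > 2).

    Small coefficients are soft-thresholded, large ones are left
    untouched, and a linear piece joins the two regions continuously.
    """
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    az = abs(z)
    if az <= 2 * lam:
        return soft_threshold(z, lam)
    if az <= a * lam:
        return ((a - 1) * z - math.copysign(a * lam, z)) / (a - 2)
    return z


if __name__ == "__main__":
    lam = 1.0
    for z in (0.5, 1.5, 3.0, 5.0):
        print(z, hard_threshold(z, lam), soft_threshold(z, lam),
              scad_threshold(z, lam))
```

In this sketch the soft rule shrinks every surviving coefficient by lam, the hard rule jumps discontinuously at |z| = lam, and the SCAD rule is continuous while leaving coefficients with |z| > a * lam unshrunk.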


Similar resources

Penalized Bregman Divergence Estimation via Coordinate Descent

Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron et al. (2004) introduced the LARS algorithm. More recently, the coordinate descent (CD) algorithm was developed by Friedman et al. (2007) for penalized linear regression and penalized logistic regression and was shown to be computationally superior. This paper explores...


Penalized Empirical Likelihood and Growing Dimensional General Estimating Equations

When a parametric likelihood function is not specified for a model, estimating equations provide an instrument for statistical inference. Qin & Lawless (1994) illustrated that empirical likelihood makes optimal use of these equations in inferences for fixed (low) dimensional unknown parameters. In this paper, we study empirical likelihood for general estimating equations with growing (high) dim...


A Connection Between Variable Selection and EM-Type Algorithms

Variable selection is fundamental to high-dimensional statistical modeling. Fan and Li (2001) proposed a class of variable selection procedures via nonconcave penalized likelihood. Optimizing the penalized likelihood function is challenging as it is a high-dimensional nonconcave function with singularities. A new algorithm is proposed for finding a solution of the nonconcave penalized likelihood...


Dimension Reduction and Variable Selection in Case Control Studies via Regularized Likelihood Optimization

Dimension reduction and variable selection are performed routinely in case-control studies, but the literature on the theoretical aspects of the resulting estimates is scarce. We contribute to this literature by studying estimators obtained via l1 penalized likelihood optimization. We show that the optimizers of the l1 penalized retrospective likelihood coincide with the optimizers ...




Publication date: 1999